When the onsets of three successive sound bursts mark two adjacent time intervals, the
second time interval can be underestimated when it is physically longer than the first time
interval by up to 100 ms. This illusion, time-shrinking, is very stable when the first time
interval is 200 ms or shorter (Nakajima et al., 2004, Perception, 33). Time-shrinking had
been considered a kind of perceptual assimilation to make the first and the second time 
interval more similar to each other. Here we investigated whether the underestimation 
of the second time interval was replaced by an overestimation if the physical difference 
between the neighboring time intervals was too large for the assimilation to take place; this 
was a typical situation in which a perceptual contrast could be expected. Three experiments 
to measure the overestimation/underestimation of the second time interval by the method 
of adjustment were conducted. The first time interval was varied from 40 to 280 ms, and 
such overestimations indeed took place when the first time interval was 80-280 ms. The 
overestimations were robust when the second time interval was longer than the first time 
interval by 240 ms or more, and the magnitude of the overestimation was larger than 100 ms 
in some conditions. Thus, a perceptual contrast to replace time-shrinking was established. 
An additional experiment indicated that this contrast did not affect the perception of the 
first time interval substantially: The contrast in the present conditions seemed unilateral. 
INTRODUCTION

When the onsets of three successive sound bursts mark two 
neighboring time intervals, the second time interval can be under- 
estimated when it is longer than the first time interval by up to 
100 ms. This underestimation, i.e., time-shrinking, is very sta- 
ble when the first time interval is 200 ms or shorter (Nakajima 
et al., 1991, 2004), and has been considered a kind of perceptual
assimilation. Assimilation and contrast in perceptual paradigms 
often replace each other when the relationship and configuration 
of stimuli are changed systematically (e.g., Helson, 1963; Morinaga 
and Noguchi, 1966). 

Assimilation and contrast may not necessarily be governed 
by a single perceptual mechanism, but they are likely to work 
under one perceptual principle for humans and animals to pro- 
cess information from the environment efficiently and quickly. 
For example, a figure in which luminance is sufficiently higher 
than in the background can be distinguished clearly from the 
background in the visual modality. This process is enhanced by 
contrast, which enlarges the perceptual difference in terms of light- 
ness or color between the figure and the background, as well as by 
assimilation, which homogenizes the lightness or color within the 
figure and within the background (Koffka, 1935; Shapley and Reid,
1985). It is also argued that, when two potential objects are separated
enough spatially from each other (but within a distance
to keep a mutual interaction), they are likely to be organized as
two separate wholes which are then contrasted (King, 1988). It 
is widely observed that perceptual assimilation between objects 
gives way to contrast when the difference between these objects is 
increased, and that assimilation can be blocked if the area or the 
group to be assimilated is broken by a boundary (or boundaries; 
e.g., Koffka, 1935; Hamburger, 2005), or by a temporal distance 
(Ikeda and Obonai, 1955). In Ikeda and Obonai's (1955) experiment,
concentric circles with different diameters I and T were
presented simultaneously for 500 ms using a tachistoscope. The 
diameter of T, whose size was to be judged, was fixed at 30 mm. 
When the physical size of I was similar to that of T, assimila- 
tion took place, but contrast took over when the physical size 
difference was larger (Table 1). The fact that assimilation and 
contrast can both take place in the same experimental context is 
described systematically by Helson (1964). One should note that 
temporal configurations of stimuli can also lead to an assimila- 
tion or contrast of the stimuli (Shigeno, 1991; see also McKenna, 
1984). In our study, assimilation and contrast were manipu- 
lated through modifying the temporal configuration of the sound 
bursts. 

When the difference between close but distinguishable objects
or events is small, the objects will be seen as part of a homogeneous
group. If the difference cannot be neglected, the objects or events
will instead be perceived in different categories. This is the case
particularly for the human auditory modality, which is responsible
for quick and complicated communication sometimes in noisy
environments without favorable acoustics.

Table 1 | Underestimation and overestimation of the size of a circle,
T = 30 mm, caused by another concentric circle, I, as observed by
Ikeda and Obonai (1955).

Diameter of T (mm)          30
Diameter of I (mm)          10      15      20      40      60      80
Overestimation of T (mm)    +1.2    -0.7    -1.4    +0.6    +0.3    -0.5
Effect                      C       A       A       A       A       C

The values are in millimeters. A, assimilation; C, contrast.
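The A/C labels in Table 1 follow from a simple rule: the effect counts as assimilation when the judged size of T moves toward the size of I, and as contrast when it moves away. The following is a minimal sketch of that rule applied to the values transcribed from Table 1; the function and variable names are ours and are purely illustrative.

```python
# Values transcribed from Table 1 (Ikeda and Obonai, 1955), in millimeters.
TEST_DIAMETER = 30.0
# Inducing circle diameter I -> over(+)/under(-)estimation of T.
OVERESTIMATION_BY_INDUCER = {10: 1.2, 15: -0.7, 20: -1.4, 40: 0.6, 60: 0.3, 80: -0.5}


def classify(inducer, overestimation, test=TEST_DIAMETER):
    """Return 'A' (assimilation) if the judged size of T shifted toward I,
    and 'C' (contrast) if it shifted away from I."""
    shifted_toward_inducer = (inducer - test) * overestimation > 0
    return "A" if shifted_toward_inducer else "C"


labels = [classify(i, o) for i, o in OVERESTIMATION_BY_INDUCER.items()]
print(labels)  # ['C', 'A', 'A', 'A', 'A', 'C']
```

Contrast appears only at the two extremes of the size difference, which is the pattern the present study looks for in the temporal domain.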

Linguistic communication depends on the human capacity to 
process strings of categorized elements in time. This requires that 
any pair of sounds or sound patterns should be clearly either the 
same or different (de Saussure, 1966); assimilation and contrast 
must work for the listener to decode speech signals properly (e.g., 
Shigeno, 1991). Temporal aspects of auditory perception are also 
very likely to work in the same manner. Relative lengths of syl- 
lables are categorized in many languages; it is often important 
for the listener to judge, without hesitation, whether or not one 
of two neighboring syllables is longer or shorter than the other. 
When time intervals are presented in concatenation, listeners often
simplify the patterns, reducing small differences and exaggerating
larger differences (e.g., Fraisse, 1978, 1982; Povel, 1981). A
ratio of 1:2 or 2:1 seems stable perceptually, which means that the
second time interval is likely to be overestimated when the neighboring
time intervals would otherwise be perceived as in a ratio of 1:1.7 or
1:1.8. We were interested in whether the extremely stable illusion
of time-shrinking, a unilateral assimilation of a time interval
to a preceding time interval or preceding time intervals, could
be understood in relation to such opposite perceptual processes. We
thus examined whether a time interval was contrasted, instead
of assimilated, to a preceding time interval at a certain point
when the difference between these adjacent time intervals was
increased step by step. When two adjacent empty time intervals
tp and ts were presented in this order in our previous research,
the same tp may have caused both underestimation and overestimation
of ts depending on the physical difference between tp
and ts. Nakajima et al.'s (2004) experiments suggested that this
possibility is systematic. Table 2 indicates the cases in which both
underestimation and overestimation reached 20 ms for a fixed tp
value.

The present paradigm thus became clear. Time-shrinking typically
takes place when two time intervals, tp and ts in this order,
marked by the onsets of three successive sound bursts meet the
following conditions: 0 < ts - tp ≤ 80 ms, and tp ≤ 200 ms. It had
been indicated already that overestimation of ts to exaggerate the
difference between tp and ts could take place when the physical
difference between the neighboring time intervals, ts - tp, exceeded
the above range (Nakajima et al., 2004). This problem had never
been taken up systematically. In order to reveal the mechanism
of rhythmic organization, however, it seemed of crucial importance
to examine whether a systematic overestimation of ts would
replace the underestimation, which we call time-shrinking, if we
increased the difference ts - tp.

Table 2 | Temporal patterns in which time-shrinking was replaced by
overestimation in Nakajima et al. (2004).

               UE                                       OE
Experiment 1   |160|200| |160|240|                      |160|320| |160|480|
Experiment 2   |160|220| |160|240| |160|260| |160|280|  |160|320|
Experiment 3   |160|200| |160|240|                      |160|320|
               |200|240| |200|280|                      |200|320| |200|360|
               |240|280|                                |240|360| |240|400|
Experiment 4   |280|360|                                |280|440|
               |320|400|                                |320|480|

Both the underestimation of a standard time interval, i.e., time-shrinking, and
the overestimation of a longer standard appeared for the same preceding time
interval in the stimulus conditions indicated in each line. The physical durations of
two adjacent time intervals tp and ts are indicated as |tp|ts| in milliseconds. The
conditions in which the underestimation/overestimation was equal to or above
20 ms were taken up to specify these temporal patterns. UE, underestimation;
OE, overestimation.

GENERAL METHODS 

The general framework common to the present experiments is
described in Figure 1. In the first three experiments, we basically
followed the paradigm employed in previous studies on time-shrinking
(e.g., Nakajima et al., 2004), except that we increased
the range of the standard duration to be judged. In the control
condition, a time interval, ts, marked by the onsets of two successive
tone bursts was the standard to be judged. An additional tone
burst preceded ts in the experimental condition; the effect of the
preceding time interval, tp, marked by the onsets of this additional
tone burst and the first marker of ts was studied. The difference in
subjective duration of ts between the control and the experimental
condition was measured.
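The measure used throughout can be written as the PSE in the experimental condition minus the PSE in the control condition for the same standard: a negative value indicates underestimation (time-shrinking) and a positive value indicates overestimation. The sketch below illustrates this with hypothetical PSE values that are not taken from the experiments; all names are ours.

```python
def overestimation(pse_experimental_ms, pse_control_ms):
    """Over(+)/under(-)estimation of the standard caused by the neighboring interval."""
    return pse_experimental_ms - pse_control_ms


# Hypothetical PSEs in ms, keyed by the physical standard duration.
# These numbers are for illustration only and are NOT experimental data.
pse_control = {200: 205.0, 240: 243.0, 400: 398.0}
pse_experimental = {200: 180.0, 240: 215.0, 400: 450.0}

effects = {
    standard: overestimation(pse_experimental[standard], pse_control[standard])
    for standard in pse_control
}

for standard, effect in effects.items():
    kind = "overestimation" if effect > 0 else "underestimation" if effect < 0 else "no effect"
    print(f"standard = {standard} ms: {effect:+.1f} ms ({kind})")
```

Referring each experimental PSE to its own control PSE removes constant errors that affect both conditions alike.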

In the last experiment, Experiment 4, a tone burst did not
precede but succeeded ts, and the effect of the succeeding time
interval, tsuc, marked by the onsets of the second marker of ts and
this additional tone burst was examined in order to interpret the
results of the first three experiments. This was the experimental
condition, and no control condition was employed because the
data of the control condition in Experiment 3 could be reused.

[Figure 1: three time charts, each starting at the initiation of a
presentation, with time on the horizontal axis: control condition
(Experiments 1-3), experimental condition (Experiments 1-3), and
experimental condition (Experiment 4).]

FIGURE 1 | Time charts of stimulus patterns. The rectangles represent
sounds. In the experiments, participants adjusted tc to make its subjective
duration equal to that of ts. In the experimental conditions of Experiments
1-3, tp was added before ts. In the experimental condition of Experiment
4, tsuc was added after ts. Note that all time intervals (ts, tp, tsuc, and
tc) refer to the duration between the onsets of successive sounds.



The method of adjustment was employed. The participant
initiated each presentation by clicking a pane on the computer
screen. A few seconds after the click (the interval was chosen
randomly within a range), the first tone burst of the standard
pattern, ts, tp/ts, or ts/tsuc, was presented. After that, there was
a period of a few seconds (the interval was again chosen randomly),
and then another time interval, the comparison, tc, was
presented, marked by the onsets of two successive tone bursts. The task
of the participant was to adjust tc to make it equal to ts in
subjective duration. The participant could change tc by operating
a screen interface, designed in a way not to give a visual
hint about the present duration, and the minimum step of the
adjustment was 1 ms. The participant was allowed to listen to
the whole sequence as many times as he/she needed until ts and
tc were perceived as equal, and finished the trial when satisfied.
The last tc value was recorded as the point of subjective equality,
PSE.

EXPERIMENT 1 

This experiment was conducted in 1996. Because we did not have
an institutional ethical committee for psychological experiments at
that time, an internal ethical review was impossible, but the experiment
was a part of a research project reviewed by a governmental
committee to select projects to be funded (as in the acknowledgments).
This experiment is included in the present report
because this was the first case in which the perceptual phenomenon
we are going to describe appeared systematically. Our original
purpose had been to determine the stimulus conditions to investigate
the effect of sound marker duration on the occurrence of
time-shrinking (underestimation), for there was a possibility that
the amount of time-shrinking might be reduced, or the time condition
for maximum time-shrinking could be shifted, by lengthening
the markers (see Hasuo et al., 2011). From the present viewpoint,
however, the experimental data gave us insight into the possibility
of systematic overestimation of the second of two adjacent
time intervals. The same ts values were employed with a tp in the
experimental condition and in isolation in the control condition.
The PSEs in these conditions were compared to see the amount
of perceptual overestimation or underestimation of ts caused
by tp.

METHODS 

Participants 

The participants were five students, i.e., three males and two 
females, of the Kyushu Institute of Design (the predecessor of 
the Faculty of Design, Kyushu University). They had received 
education for acoustic design, including basic training in music 
performance. They were 20-24 years old, and had normal 
hearing. 

Materials 

Duration markers were pure tone bursts of 1000 Hz and 12, 63,
or 123 ms with a rise and a fall time of ~2 ms each. These values
were inexact due to our use of an analog filter to shape the
waveform; the inexactness was sufficiently small relative to the
effect we were measuring. The tone bursts of different durations
were approximately equal in loudness when presented separately.
This was realized by conducting preliminary measurements in
which the participant could listen to any of the three sounds
by clicking corresponding buttons on the computer screen. The
stimulus sound was always presented 200 ms after the button
was clicked. The level of the 12-ms burst, which was very short,
was fixed at 97 dBA, defined as the level of a continuous tone
of the same amplitude measured with an artificial ear (Brüel
and Kjær 4153), a microphone (Brüel and Kjær 4134), and a
sound level meter (Brüel and Kjær 2209). The levels of the other
sounds were adjustable, and the participant was instructed to
equalize the three sounds in terms of loudness. In each trial,
the adjusted levels of the 63 and 123 ms bursts were recorded.
The participant performed eight trials, and the median value
for each sound was employed as the presentation level in the
main part of the experiment. The presentation levels were 87-94 dBA
for the 63-ms burst, and 85-93 dBA for the 123-ms
burst.

The pure tones were first generated as rectangular pulse series 
before being band-pass filtered between 850 and 1250 Hz (NF 
DV-6BW). This resulted in tone bursts with rise and fall times 
of ~2 ms. The tone bursts were presented to the left ear of the 
participant through an amplifier (JVC AX-Z511) and headphones 
(AKG K141) in a soundproof room. The experimental procedure 
including stimulus generation was controlled by a quiet computer 
without a hard disk drive or a fan (Commodore Amiga 500). 

In the main part of the experiment, the marker duration was
fixed in each standard pattern, which was marked by two or
three successive tone bursts, and the comparison time interval was
always marked by two 12-ms tone bursts. In the standard patterns
of the experimental condition, tp/ts, the preceding time interval,
tp, was fixed at 160 ms. Both in the control and in the experimental
condition, the standard time interval, ts, was varied from
120 to 480 ms in steps of 40 ms. The ts duration of 120 ms was
not possible when the marker duration was longer, i.e., 123 ms;
this condition was omitted. Thus, there were 58 stimulus patterns:
2 (control/experimental) x [2 (marker durations ≤ 63 ms) x 10
(ts durations) + 1 (marker duration = 123 ms) x 9 (ts durations)].
The standard pattern was presented 2300-2500 ms after
the participant clicked a button on the screen. There was a silence
of 2700-3300 ms between the offset of the last sound marker of
ts and the onset of the first sound marker of tc.
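The count of 58 stimulus patterns can be checked by enumerating the design. The sketch below assumes ten ts values from 120 to 480 ms in 40-ms steps; the upper bound is inferred from the ten durations stated in the design and from the 160-480 ms range used in the ANOVA. Variable names are ours.

```python
# Assumption: ts runs from 120 to 480 ms in 40-ms steps (ten values).
TS_VALUES_MS = list(range(120, 481, 40))
MARKER_DURATIONS_MS = [12, 63, 123]
CONDITIONS = ["control", "experimental"]

patterns = [
    (condition, marker, ts)
    for condition in CONDITIONS
    for marker in MARKER_DURATIONS_MS
    for ts in TS_VALUES_MS
    # A 120-ms onset-to-onset interval cannot be marked by 123-ms bursts,
    # so that combination is omitted, as described in the text.
    if not (marker == 123 and ts == 120)
]

# Each pattern is adjusted once in an ascending and once in a descending series.
trials_per_replication = len(patterns) * 2

print(len(patterns))           # 58
print(trials_per_replication)  # 116
```

The second figure matches the 116 trials per replication given in the Procedure.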

Procedure 

The participant performed four adjustment trials, two in ascend- 
ing series and two in descending series, for each stimulus pattern: 
two replications for both series were performed. One replication 
comprised the first half, and the other the second half of the whole 
measurement. Each replication (= half) consisted of 116 trials, 58 
(stimulus patterns) x 2 (series) in random order, and was divided 
into 9 blocks of 12 or 13 measurement trials, which were pre- 
ceded by two warm-up trials. Preceding the measurement, the 
participant performed 58 training trials, divided into four blocks; 
each stimulus pattern appeared once. Thus, the whole experiment 
consisted of 22 blocks: 4 (training blocks) + 2 (replications) x 9
(measurement blocks). Each block took around 15-20 min, and
the whole experiment was carried out over a period of 8 days for 
each participant. 

RESULTS AND DISCUSSION 

We performed a three-way [marker duration x condition (experimental/control)
x ts duration] ANOVA utilizing the PSEs for
ts = 160-480 ms. Since it is commonplace that PSEs change as
ts changes, we will not detail the main effect of this factor either
here or in the following experiments; its main effect was
always significant (p < 0.001). The main effect of marker duration
was significant, F(2,8) = 21.902, p < 0.01, ηp² = 0.846. Ryan's
post hoc test showed that the differences between all pairs
of marker durations, i.e., 12 and 123 ms, 63 and 123 ms, and 12 and
63 ms, were significant (p < 0.05). The interaction between condition
(experimental/control) and ts duration was also significant,
F(8,32) = 4.614, p < 0.01, ηp² = 0.536. This interaction should be
related to the assimilation and contrast of ts to tp. The main effect
of condition (experimental/control) and the other interactions
were not significant (p > 0.05).
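The effect sizes reported with the F ratios are consistent with partial eta squared, which can be recovered from an F value and its degrees of freedom as F x df_effect / (F x df_effect + df_error). The following sketch applies this standard identity to the two F ratios above; the function name is ours.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F ratio and its degrees of freedom."""
    return f_value * df_effect / (f_value * df_effect + df_error)


# Main effect of marker duration, F(2,8) = 21.902
print(round(partial_eta_squared(21.902, 2, 8), 3))   # 0.846
# Condition x ts duration interaction, F(8,32) = 4.614
print(round(partial_eta_squared(4.614, 8, 32), 3))   # 0.536
```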

The PSEs in the control condition were very close to the physical
values of ts (Figure 2). Slight deviations appeared systematically,
however: PSEs of shorter duration tended to be longer than the
physical values of ts. This kind of time error sometimes appears
in the literature of time perception (Woodrow, 1951; Eisler et al.,
2008). The PSEs tended to be slightly longer when the marker
duration was longer, but the present data do not offer much information
on this issue. This issue should be investigated intensively
in the future in order to understand rhythm perception in speech
or music. Hasuo et al. (2011, 2012) reported that inter-onset time
intervals up to 360 ms tended to be perceived as longer when the
duration of the sound markers to terminate the time intervals was
longer. This was the case whether the time interval to be judged
was isolated or neighboring another time interval. The duration
of the sound markers to initiate the time intervals showed similar
effects, but in a more unstable manner.

The PSEs in the control and in the experimental condition
differed systematically. The experimental PSEs were smaller than
the corresponding control PSEs when ts = 200 or 240 ms, i.e.,
when ts - tp = 40 or 80 ms: ts was underestimated, showing time-shrinking
in a typical manner. However, the difference between
the control and the experimental condition was reversed when
ts was longer: the experimental PSEs were systematically greater
than the control PSEs when ts ≥ 320 ms. Thus, time-shrinking as
assimilation of ts to tp appeared when the difference between these
neighboring time intervals was small, and gave way to contrast of
ts to tp when the difference was large.

The above tendency appeared in similar ways in all the marker
conditions between the control and the experimental PSEs despite
the fact that the control PSEs increased slightly, but clearly, if
the sound marker duration was increased. The contrast appeared
as overestimation of ts in the experimental condition against the
control condition. The PSEs were already lengthened in the control
condition if the sound markers were longer, and they became
even longer (were overestimated further) in the experimental
condition. Furthermore, the amount of overestimation was larger
when the duration markers were longer. This is in contrast with
the fact that the magnitude of time-shrinking (underestimation)
is often smaller when longer markers are used (Yamashita and
Nakajima, 1999; Hasuo et al., 2011), as was the case also in the
present experiment.

FIGURE 2 | Mean PSEs obtained from five participants in Experiment 1.
PSE corresponds to the duration of tc that was perceived to be equal to the
duration of ts. The results for marker durations 63 and 123 ms were raised by
300 and 600 ms, respectively, in this graph for clarity. The physical values of
ts (the points of objective equality) are indicated by dotted lines. Error bars
represent standard deviations between participants.

The overestimation, as represented by the difference in the PSEs
between the control and the experimental condition, seemed to
have a local peak when ts = 320 ms for all the marker durations.
This tendency was peculiar and robust, but we leave this issue for
future research.

To test whether the common tendency in overestimation pattern
(i.e., the difference between the control and the experimental
PSEs over the ts duration range) across different marker durations
was statistically significant, we conducted a Friedman test
(e.g., Siegel and Castellan, 1988) utilizing the mean overestimation
values for each marker duration. There was a statistically
significant tendency in overestimation, χ²(8) = 23.644, p = 0.003.
To examine whether the overestimation patterns had a common
tendency even when the influence of time-shrinking (the negative
overestimation at ts - tp = 40 or 80 ms) was cancelled, we
also performed the same Friedman test without the conditions in
which ts - tp = 40 or 80 ms. The tendency in overestimation
pattern was significant again, χ²(6) = 17.714, p = 0.007. The statistical
significance in this additional Friedman test confirmed that
the overestimation patterns had a common tendency even without
the influence of time-shrinking.
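To make the structure of this test concrete, the sketch below computes the Friedman statistic with the three marker durations as blocks and the nine ts conditions as treatments, which gives 8 degrees of freedom. It is a plain-Python illustration without tie correction and without a p value, not the software used for the analysis, and the input values are hypothetical rather than experimental data.

```python
def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1
        for j in range(pos, end + 1):
            ranks[order[j]] = mean_rank
        pos = end + 1
    return ranks


def friedman_statistic(blocks):
    """blocks: n rows (e.g., marker durations), each holding k values
    (e.g., mean overestimation per ts condition).
    Returns (chi-square statistic, degrees of freedom); no tie correction."""
    n, k = len(blocks), len(blocks[0])
    rank_sums = [0.0] * k
    for row in blocks:
        for j, rank in enumerate(average_ranks(row)):
            rank_sums[j] += rank
    stat = 12.0 / (n * k * (k + 1)) * sum(r * r for r in rank_sums) - 3.0 * n * (k + 1)
    return stat, k - 1


# Hypothetical mean overestimations (ms), NOT experimental data:
# three blocks whose nine conditions are ordered identically.
hypothetical = [
    [-25, -10, 5, 20, 30, 35, 40, 45, 50],
    [-30, -12, 8, 25, 33, 41, 44, 52, 60],
    [-18, -5, 10, 28, 36, 47, 55, 61, 70],
]
stat, df = friedman_statistic(hypothetical)
print(round(stat, 3), df)  # 24.0 8
```

With n = 3 blocks and k = 9 conditions the statistic cannot exceed n(k - 1) = 24, so the reported value of 23.644 indicates that the three marker durations ranked the ts conditions almost identically.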

EXPERIMENT 2 

Experiments 2-4 were part of a research project approved by
the research ethics committee of the Faculty of Design, Kyushu
University, in 2010. Experiment 1 and our previous data on
time-shrinking (e.g., Nakajima et al., 2004) revealed that the
underestimation of a time interval that appeared as assimilation of
ts to tp often gave way to contrast when ts - tp > 120 ms. Because
we did not have systematic data indicating this effect except in
Experiment 1, we decided to conduct an experiment in which ts
was varied in a larger range (up to 640 ms). For tp, we chose
three values: 80, 120, and 160 ms. Time-shrinking appears most
stably in this range of tp (Nakajima et al., 2004; Miyauchi and
Nakajima, 2005), and we first needed experimental data under
such conditions. One of the things we were interested in was
whether any overestimation would appear for tp = 120 ms; there
had been occasional cases in previous data in which ts had been
overestimated for tp = 80 or 160 ms, but no such cases ever for
tp = 120 ms. Most importantly, we wanted to see whether the
typical time-shrinking, which was expected reliably if ts - tp = 40
or 80 ms, would give way to contrast, i.e., overestimation
of ts.

METHODS 

Participants 

Five students of Kyushu University, three males and two females, 
participated. One of them had been educated to become a high- 
school music teacher, and three of them had received education for 
acoustic design, including basic training in music performance. 
The fifth one was an amateur musician who had been playing 
percussions for 8 years. They were 21-46 years old. 

Materials 

Duration markers were pure tone bursts of 1000 Hz and 10 ms with
cosine-shaped rise and fall times of 5 ms each, with no steady-state
part. Their level was 80 dBA, defined as the level of a continuous
tone of the same amplitude measured with an artificial ear
(Brüel and Kjær 4153), and a sound level meter (Node 2072 or
2075). The tone bursts were presented diotically to the participant
through an amplifier (Stax SRM-323A) and headphones
(Stax SR-303) in a soundproof room. The experimental procedure
including stimulus generation was controlled by a computer
(Frontier KZFM71/N) with an audio processor (Onkyo Wavio
SE-U55GX). Stimulus patterns were generated digitally (16 bits;
a sampling frequency of 44100 Hz), and went through a 16-kHz
low-pass filter (NF DV-8FL) to avoid aliasing.
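A marker of this kind can be sketched digitally as a 1000-Hz sine multiplied by a raised-cosine envelope that rises for 5 ms and falls for 5 ms. The code below is a minimal illustration under assumptions of our own (zero starting phase, unit peak amplitude, no level calibration or low-pass filtering); it is not the stimulus-generation program used in the experiment.

```python
import math


def tone_burst(freq_hz=1000.0, rise_ms=5.0, fall_ms=5.0, steady_ms=0.0,
               sample_rate_hz=44100):
    """Sine tone burst with cosine-shaped rise and fall (values in [-1, 1]).

    Assumptions made for this sketch: the sine starts at zero phase and the
    peak amplitude is 1.0.
    """
    total_ms = rise_ms + steady_ms + fall_ms
    n_total = round(sample_rate_hz * total_ms / 1000.0)
    samples = []
    for n in range(n_total):
        t_ms = 1000.0 * n / sample_rate_hz
        if t_ms < rise_ms:
            envelope = 0.5 * (1.0 - math.cos(math.pi * t_ms / rise_ms))
        elif t_ms < rise_ms + steady_ms:
            envelope = 1.0
        else:
            elapsed_ms = t_ms - rise_ms - steady_ms
            envelope = 0.5 * (1.0 + math.cos(math.pi * elapsed_ms / fall_ms))
        samples.append(envelope * math.sin(2.0 * math.pi * freq_hz * n / sample_rate_hz))
    return samples


burst = tone_burst()
print(len(burst))  # 441 samples = 10 ms at 44100 Hz
```

With no steady-state part, the rise and fall together form a single smooth window over the 10-ms burst, which limits spectral splatter at the onset and offset.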

In the standard patterns of the experimental condition, tp/ts,
the preceding time interval, tp, was 80, 120, or 160 ms, for which
time-shrinking had occurred typically in previous studies (e.g.,
Nakajima et al., 2004). Overestimation of ts had been recorded
for tp = 80 and 160 ms, but only in a few stimulus patterns for
each tp value, and only up to 30 ms, except for Experiment 1 of
the present article. For tp = 120 ms, no related measurements
had been done before. The standard time interval, ts, was varied
from 40 to 640 ms in steps of 40 ms both in the experimental
and in the control condition. There were 64 stimulus patterns:
4 (1 control + 3 tp durations) x 16 (ts durations). The standard
pattern was presented 1500-2500 ms after the participant initiated
a presentation. There was an interval of 3000-4000 ms between
the onsets of ts and tc.

Procedure 

The participant performed two adjustment trials, one in ascending
series and one in descending series, for each stimulus pattern,
and thus 128 trials in total: 64 (stimulus patterns) x 2 (series),
which were arranged in random order and divided into 11 blocks
of 11 or 12 measurement trials preceded by two warm-up trials.
Before the measurement, the participant performed one training
session of 16 trials, for which representative stimulus patterns were
employed. Thus, the whole experiment consisted of 12 blocks: 1
(training block) + 11 (measurement blocks). Each block took
around 15-30 min, and the whole experiment was carried out
over a period of 2-3 days for each participant.

RESULTS AND DISCUSSION 

We performed a two-way [condition (1 control + 3 tp durations)
x ts duration] ANOVA utilizing the PSE values. The main
effect of condition (1 control + 3 tp durations) was significant,
F(3,12) = 8.624, p < 0.01, ηp² = 0.683, and so was the interaction
between condition (1 control + 3 tp durations) and ts duration,
F(45,180) = 3.344, p < 0.01, ηp² = 0.455.

The PSEs in the control condition were close to the physical values
of ts, but slight deviations appeared systematically (Figure 3).
PSEs of longer duration tended to be longer than the physical values
of ts, and this was not consistent with the tendency observed
in Experiment 1. In both cases, however, the observed deviations
were extremely small, and can be neglected for our present
purpose.

The PSEs in the control and in the experimental condition
differed systematically. The experimental PSEs were smaller when
ts = tp + 40 or tp + 80 ms, indicating a robust occurrence of time-shrinking.
This underestimation of ts, however, was replaced by
overestimation, whose highest magnitude reached above 50 ms,
when ts ≥ tp + 240 ms for all the tp values. Thus, as in Experiment
1, time-shrinking appeared when the difference between ts and tp
was 40 or 80 ms, and contrast of ts to tp took over when ts was
lengthened.
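The two regions described above can be marked on the design of Experiment 2 as follows. The sketch takes the boundaries from the text, reading the contrast boundary as inclusive (a difference of 240 ms or more, as stated in the abstract); conditions outside both regions are labeled as unspecified because the text makes no general claim about them. All names are ours.

```python
def expected_effect(tp_ms, ts_ms):
    """Region of the design in which a (tp, ts) pattern falls."""
    difference = ts_ms - tp_ms
    if 0 < difference <= 80:
        return "time-shrinking"   # underestimation of ts
    if difference >= 240:         # assumption: boundary taken as inclusive
        return "contrast"         # overestimation of ts
    return "unspecified"


TP_VALUES_MS = [80, 120, 160]
TS_VALUES_MS = list(range(40, 641, 40))  # 16 standard durations

design = {
    (tp, ts): expected_effect(tp, ts)
    for tp in TP_VALUES_MS
    for ts in TS_VALUES_MS
}

n_shrinking = sum(effect == "time-shrinking" for effect in design.values())
n_contrast = sum(effect == "contrast" for effect in design.values())
print(len(design), n_shrinking, n_contrast)  # 48 6 24
```

Of the 48 experimental patterns, 6 fall in the time-shrinking region and 24 in the contrast region; the remaining 18 lie before, at, or between the two regions.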

When tp = 160 ms as in Experiment 1, the overestimation again
seemed to have a local peak when ts = 320 ms. This tendency
indeed seems interesting, but is an issue to be investigated in the
future.




FIGURE 3 | Mean PSEs obtained from five participants in
Experiment 2. PSE corresponds to the duration of tc that was
perceived to be equal to the duration of ts. The results for tp = 120
and 160 ms were raised by 300 and 600 ms, respectively, in this
graph for clarity. The physical values of ts (the points of objective
equality) are indicated by dotted lines, on which tp values are
indicated by arrows. Error bars represent standard deviations between
participants.

To test whether the common tendency in overestimation pattern
across different tp values (i.e., the underestimation of ts
when ts = tp + 40 or tp + 80 ms and the overestimation when
ts ≥ tp + 240 ms, observed for all tp values) was statistically
significant, we conducted a Friedman test utilizing the mean overestimation
values for each tp duration (= 80, 120, or 160 ms).
There was a statistically significant tendency in overestimation
depending on the difference between the two neighboring intervals
(ts - tp = -40 to 480 ms), χ²(13) = 34.505, p = 0.001.
As in Experiment 1, we also performed the same Friedman test
without the conditions in which ts - tp = 40 or 80 ms, where time-shrinking
should have taken place. The tendency in overestimation
pattern was significant again, χ²(11) = 27.410, p = 0.004.

EXPERIMENT 3 

Time-shrinking almost disappeared, although not completely,
when tp was above 300 ms (Nakajima et al., 2004, Figure 11). Our
next step was to examine whether the tendency for ts to be underestimated
when ts = tp + 40 or tp + 80 ms and overestimated
when ts was further lengthened, as observed in Experiments 1 and
2, would appear throughout the tp range in which we could expect
time-shrinking. Because the overestimation of ts appeared in a
very wide range of ts in Experiment 2, we made the range of ts in
the present experiment even wider.

METHODS 

Participants 

Six students of Kyushu University, three males and three females, 
participated. Four of them had taken part in Experiment 2, but 
there had been an interval of at least 3 months. One of the partic- 
ipants had been educated to become a high-school music teacher, 
and four of them had received education for acoustic design, 
including basic training in music performance. The sixth one was 
an amateur musician who had been playing percussions for 8 years. 
They were 20-46 years old. 

Materials 

Duration markers and the way of presentation were the same as
in Experiment 2. In the standard patterns of the experimental
condition, tp/ts, tp = 40, 120, 200, or 280 ms, where time-shrinking
had occurred clearly (Nakajima et al., 2004). Overestimation of ts
had been recorded for these tp values, but only in a handful of
stimulus patterns, and only up to 30 ms, except for Experiment 2
of the present article. The standard time interval, ts, was varied
from 40 to 1000 ms in steps of 80 ms both in the control and
in the experimental condition. There were 65 stimulus patterns:
5 (1 control + 4 tp durations) × 13 (ts durations). The standard
pattern was presented 1500-2500 ms after the participant initiated
a presentation. There was an interval of 4000-5000 ms between
the onsets of ts and tc.
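
The design arithmetic above can be checked by enumerating the stimulus set. This is only a minimal sketch: the variable names, and the use of `None` to stand for the control condition (no preceding interval), are illustrative assumptions and not part of the original procedure.

```python
# Enumerate the Experiment 3 standard patterns as described in the text.
# Names are illustrative; None stands for the control condition.
preceding_ms = [None, 40, 120, 200, 280]   # 1 control + 4 preceding-interval durations
standard_ms = list(range(40, 1001, 80))    # standard interval: 40 to 1000 ms in steps of 80 ms

# Every preceding-interval condition is crossed with every standard interval.
patterns = [(p, s) for p in preceding_ms for s in standard_ms]

print(len(standard_ms))  # 13 standard-interval durations
print(len(patterns))     # 65 stimulus patterns
```

Two adjustment series (ascending and descending) per pattern then give the 130 measurement trials mentioned in the Procedure.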

Procedure 

The participant performed two adjustment trials, one in ascending 
series and one in descending series, for each stimulus pattern, and 
thus 130 trials in total: 65 (stimulus patterns) x 2 (series), which 
were arranged in random order and divided into 10 blocks of 13 
measurement trials preceded by two warm-up trials. Before the 
measurement, the participant performed 15 training trials, for
which representative stimulus patterns were employed. Thus, the
whole experiment consisted of 14 blocks: 1 (training block) + 13 
(measurement blocks). Each block took around 15-30 min, and 
the whole experiment was carried out over a period of 2-3 days 
for each participant. 

RESULTS AND DISCUSSION 

We performed a two-way [condition (1 control + 4 tp durations)
× ts duration] ANOVA utilizing the PSE values. The main
effect of condition (1 control + 4 tp durations) was significant,
F(4,20) = 6.450, p < 0.01, ηp² = 0.563, and so was the interaction
between condition (1 control + 4 tp durations) and ts duration,
F(48,240) = 2.539, p < 0.01, ηp² = 0.337.
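
The effect sizes reported with these F ratios appear to be partial eta squared; this is an inference from the numbers rather than something the text states. Under that assumption, they can be recovered from each F ratio and its degrees of freedom with the standard identity, as in the following sketch (the function name is illustrative).

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Partial eta squared recovered from an F ratio and its degrees of freedom.

    Uses the identity eta_p^2 = (F * df_effect) / (F * df_effect + df_error).
    """
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Values taken from the ANOVA reported in the text.
print(round(partial_eta_squared(6.450, 4, 20), 3))    # 0.563 (main effect of condition)
print(round(partial_eta_squared(2.539, 48, 240), 3))  # 0.337 (interaction)
```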

The PSEs in the control condition were very close to the physical
values of ts (Figure 4). Although slight deviations appeared again
systematically, they were almost unrecognizable in the graphs
except for the longest ts values, for which PSEs tended to be slightly
shorter than the corresponding points of objective equality.

The PSEs in the control and in the experimental condition
differed systematically. The experimental PSEs were conspicuously
smaller when ts = tp + 80 ms, again showing the robustness of
time-shrinking. For tp = 120, 200, and 280 ms, the underestimation
of ts was replaced by overestimation when ts was longer. When
ts ≥ tp + 240 ms, the PSEs in the experimental condition were
never smaller than those in the control condition. For tp = 200
and 280 ms, the overestimation exceeded 100 ms, which
is comparable to the temporal illusions Israeli (1930) reported
in the visual modality. For tp = 40 ms, no clear overestimation
appeared. When the same preceding interval duration was
employed in Nakajima et al.'s (2004) Experiment 1, however, some
overestimation appeared stably, although the amount was only
about 10 ms, and it would be safer to reserve any clear conclusion
for this tp value. In the present experiment, time-shrinking




FIGURE 4 | Mean PSEs obtained from six participants in Experiment 3.
PSE corresponds to the duration of tc that was perceived to be equal to
the duration of ts. The results for tp = 120, 200, and 280 ms were raised by
300, 600, and 900 ms, respectively, in this graph for clarity. The physical
values of ts (the points of objective equality) are indicated by dotted lines,
on which tp values are indicated by arrows. Error bars represent standard
deviations between participants.

appeared when the difference between ts and tp was 80 ms, and
contrast of ts to tp took over when ts was lengthened, except when
tp = 40 ms.

As in Experiment 2, we conducted a Friedman test utilizing
the mean overestimation values for each tp duration to examine
whether the common tendency in the overestimation pattern
across the different tp values (40, 120, 200, and 280 ms) was
statistically significant. There was a statistically significant tendency
in overestimation depending on the difference between the two
neighboring intervals (ts − tp = 0 to 720 ms), χ²(9) = 25.855,
p = 0.002. We also performed the same Friedman test, but
without the (negative) overestimations in conditions in which
ts − tp = 80 ms, where time-shrinking should have taken
place. The tendency in the overestimation pattern was significant
again, χ²(8) = 19.600, p = 0.012, confirming that the
overestimation patterns had a common tendency even when the
influence of time-shrinking (the dip at ts − tp = 80 ms) was
cancelled.
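
To make the structure of the Friedman test described above concrete, here is a minimal pure-Python sketch of the test statistic. It assumes that each row is one preceding-interval duration and each column is one level of the difference between the two intervals, a reading that matches the reported degrees of freedom but is not spelled out in the text. The data values are invented for illustration only and are not the study's results; no tie correction is applied.

```python
def average_ranks(values):
    """Rank values from 1 to k within one row, giving ties their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for idx in order[i:j + 1]:
            ranks[idx] = mean_rank
        i = j + 1
    return ranks


def friedman_statistic(blocks):
    """Friedman chi-square (without tie correction); rows = blocks, columns = treatments."""
    n = len(blocks)        # number of blocks (here: preceding-interval durations)
    k = len(blocks[0])     # number of treatments (here: interval-difference levels)
    rank_sums = [0.0] * k
    for row in blocks:
        for col, rank in enumerate(average_ranks(row)):
            rank_sums[col] += rank
    return 12.0 / (n * k * (k + 1)) * sum(r * r for r in rank_sums) - 3.0 * n * (k + 1)


# Hypothetical overestimation values in ms (NOT the study's data):
# 4 preceding-interval durations x 4 difference levels.
example_overestimation_ms = [
    [0, -30, 20, 40],
    [5, -45, 35, 70],
    [-5, -40, 60, 110],
    [0, -35, 50, 90],
]
stat = friedman_statistic(example_overestimation_ms)
df = len(example_overestimation_ms[0]) - 1
print(stat, df)
```

The statistic is referred to a chi-square distribution with k − 1 degrees of freedom, so ten difference levels give 9 degrees of freedom and nine levels give 8.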

EXPERIMENT 4 

The overestimation of ts took place to a remarkable degree in
Experiments 1-3. It seemed necessary to have some idea on
whether this strong contrast, which was observed between the
two neighboring time intervals, t1 and t2 in this order, for the
perception of t2, also affected the perception of t1. Because
time-shrinking was a unilateral illusion affecting mainly the perception
of t2, we first examined whether, and if so how, the
underestimation of t2 gave way to overestimation, and this indeed
happened to a remarkable degree. Now it seemed important to
check whether this contrast was unilateral or bilateral. In the
present study, we just conducted an experiment to be appended
to Experiment 3, but this would help us to interpret the present
results. We selected six temporal patterns of two neighboring
time intervals in which contrast between them had caused
overestimation of t2 (ts in Experiment 3). Then PSEs of t1 were
measured for these patterns. For example, we took a pattern
of t1 = 200 ms and t2 = 680 ms, in which t2 had been
overestimated by more than 100 ms in Experiment 3. In the present
experiment, we were interested in whether or not the same
mechanism of contrast (bilaterally) led to the underestimation of t1,
making its PSE shorter than the control value. Because t1 was
the standard time interval, it is called ts, and the succeeding time
interval t2 is called tsuc in the present report. In other words, we
used the same temporal patterns of two neighboring time intervals
marked by three successive sounds as in Experiment 3, and
the key difference was that tc was adjusted to match the perceived
duration of the first interval instead of that of the second
interval.

Due to the unavailability of a certain potential participant, we 
decided to employ five of the six participants from Experiment 3, 
making it still possible to reuse the data in the control condition 
of Experiment 3. 

METHODS 

Participants 

Five students, three males and two females, participated in this 
experiment after participating in Experiment 3. There had been an
interval of at least 1 month between these experiments. They were
21-25 years old. Four of them had taken part in Experiment 2, but 
there had been an interval of at least 3 months. Four of them had 
received education for acoustic design, including basic training in 
music performance. The fifth one was an amateur musician who 
had been playing percussions for 8 years. 

Materials 

Six stimulus patterns were chosen from the stimulus patterns in
Experiment 3. In the standard patterns of the experimental
condition, ts/tsuc, the standard time interval, ts, was 120, 200, or
280 ms; these values had been chosen for tp in Experiment 3. The
control patterns of these ts values in Experiment 3 were regarded
as the virtual control patterns of the present experiment, and
thus the control data of the present participants were reused. The
succeeding time interval, tsuc, was 440 or 680 ms; tsuc in any
stimulus pattern would have been overestimated stably if it had
been the standard time interval. There were six stimulus patterns
not including the virtual control patterns. The standard pattern
was presented 1500-2500 ms after the participant initiated a
presentation. There was a silence of 4000-5000 ms between the onsets
of ts and tc.

Procedure 

The participant performed two adjustment trials, one in ascend- 
ing series and one in descending series, for each stimulus pattern, 
and thus 12 trials in total arranged in random order. Four trials 
were conducted first for training and a warm-up, and the mea- 
surement trials followed without a break. The experiment took 
around 20 min. 

RESULTS AND DISCUSSION 

We performed a two-way [condition (1 control + 2 tsuc durations)
× ts duration] ANOVA utilizing the PSE values. Neither the
main effect of condition (1 control + 2 tsuc durations) nor the
interaction between condition (1 control + 2 tsuc durations) and
ts duration was significant, F(2,8) = 0.222, p > 0.05, ηp² = 0.052;
F(4,16) = 2.740, p > 0.05, ηp² = 0.407, respectively.

The PSEs in the control condition were almost equal to the 
physical values of ts (Figure 5). The PSEs in the control and in the 
experimental condition were very close to each other. Underesti- 
mation of ts that should have occurred if the systematic contrast 
in Experiment 3 were bilateral did not take place to any observable 
degree. Although we do not have sufficient data to conclude that 
the systematic contrast observed in Experiments 1, 2, and 3 was 
unilateral, the underestimation of ts was almost negligible even in
conditions in which the mechanism of contrast must have worked 
clearly. The observed contrast was at least very close to unilateral. 

GENERAL DISCUSSION 

The purpose of the present study was to observe the overesti- 
mation of an empty time interval caused by a preceding time 
interval. The conditions in the present study were comparable 
to the conditions in which time-shrinking had been reported to 
take place. We had assumed that time-shrinking was a unilateral 
perceptual assimilation of an empty time interval to a shorter 
preceding time interval. One may wonder whether the poten- 
tial rhythmic regularity of presented patterns may be playing a 

FIGURE 5 | Mean PSEs obtained from five participants in
Experiment 4. PSE corresponds to the duration of tc that was perceived to
be equal to the duration of ts. Some dots are shifted slightly from the
scale marks on the horizontal axis to keep them visible in the graph. The
physical values of ts (the points of objective equality) are indicated by a
dotted line. Error bars represent standard deviations between participants.



crucial role, but this idea is not supported by the fact that
time-shrinking took place even when the preceding time interval and
the time interval to be judged were separated in time (Sasaki et al.,
2002). The assumption of "assimilation" itself is not related to
any particular perceptual mechanism directly, but it can give us
a wider view of the observed facts. Because perceptual assimilation
and contrast often appear in the same context, we examined
whether the unilateral assimilation, time-shrinking,
could give way to contrast when the difference between the
neighboring time intervals was increased. The range of the first time
interval that can cause time-shrinking has been determined
systematically in previous studies, and it has been established that the
illusion takes place only when the difference between the
neighboring time intervals is smaller than ~100 ms. This knowledge
made it possible for us to focus on the stimulus conditions in
which contrast was likely to take place. As a result, overestimation
of the second of the neighboring time intervals appeared
systematically.

When tp precedes and neighbors ts, which can cause time-shrinking
(i.e., the systematic underestimation of ts), an overestimation of ts was
observed when ts was lengthened. The only exception was when tp
was set to be extremely short, i.e., tp = 40 ms. The overestimation
of ts never disappeared when ts − tp ≥ 240 ms for the other
tp values. The overestimation as a function of ts − tp showed a
common tendency across the different tp values (Figure 6), which
was confirmed by the Friedman tests.

What we had not expected was that the contrast appeared in 
such a wide range and to such a large degree. About the range of 
the second time interval, we have already reached 1 s as the longest 
duration. It will be very important in the future to determine 
the upper limit of the range in which the overestimation takes 

FIGURE 6 | Overestimations of ts as functions of ts − tp in Experiments 1, 2, and 3. The overestimations were calculated as the increases of PSEs due to
the presence of tp.

place, but this would require a new experimental paradigm because
we can easily reach the perceptual limit; when a time interval is 
equal to or above 1.5-2 s, it is often difficult to grasp the whole 
interval perceptually, or to perceive it as a part of a single rhythm 
pattern (Fraisse, 1978; Nakajima et al., 1980; Warren, 2008; see also
Grondin, 2012 for a perceptual limit at around 1.5 s). 

The amount of the overestimation sometimes surpassed 
100 ms. Although similar overestimation had appeared occa- 
sionally in previous studies on unilateral or bilateral assimilation 
between neighboring time intervals, the positive overestimation 
had never reached 40 ms except in the present Experiment 1. 
It turned out now that the overestimation can be larger than 
time-shrinking in terms of deviation from the control PSEs in 
milliseconds. Although we had (re)started this study as some- 
thing to be added to the studies of perceptual assimilation between 
time intervals, the overestimation of the second time interval now 
appeared as a phenomenon worth investigating more systemati- 
cally in different series of studies. It is particularly necessary to 
examine whether the present results can be related to the fact that 
a successive presentation of two objects (as would be inevitable 
for time intervals) could facilitate the perceptual contrast between 
them (Ikeda and Obonai, 1955). 

Fraisse (1978, 1982) argued that rhythm patterns were often 
based on two dominant duration values, and that they were mostly 
in a ratio 1:2, and occasionally in 1:3; in Western music, the shorter 
durations were typically 150-290 ms, and the longer durations 
300-900 ms. This could explain the overestimation in the present 
study in some cases. Perceptual contrast can often take place as, 
or as a result of, categorical perception, although it is often diffi- 
cult to relate results in different paradigms (Repp and Liberman, 
1987). If a shorter duration and a longer duration neighboring 
each other are to be perceived as in different perceptual categories, 
i.e., in the short-duration category and in the long-duration cat- 
egory, this can be an aspect, or a cause, of perceptual contrast. In 
the present experiments, the first time interval was always below 
290 ms, and the second time interval was mostly above 300 ms 
when it was overestimated. Most cases in which tp caused the
overestimation of ts can be interpreted by the fact that tp < 300 ms < ts,
which should have caused the time intervals to be placed in
different perceptual categories, which then should have led to the
overestimation of ts. This interpretation describes the general
tendency of the present data rather well, and is worth investigating
further. However, the categorical boundary at about 300 ms is 
hardly a part of common knowledge, and a systematic investiga- 
tion on this issue should be the first thing necessary to pursue this 
path. 

Another possible explanation related to a categorical aspect 
of temporal perception is related to the studies of Miyauchi and 
Nakajima (2005) and ten Hoopen et al. (2006; see also Sasaki et al., 
1998; and Miyauchi and Nakajima, 2007). They presented auditory 
temporal patterns as used in the present experiments to partici- 
pants, and established a 1:1 category, i.e., a perceptual category 
in which the neighboring time intervals are perceived as equal 
to each other even when the physical difference between them 
is greater than the differential limen. One of the boundaries of
this category was very close to the point at which time-shrinking
reaches its maximum, i.e., the point at which ts − tp ≈ 80 ms;
the overestimation of ts typically appeared when the difference
between tp and ts was double this value. This is an idea to be kept
for future research, but some difficulty arises if we are to explain
why the contrast appeared not immediately when the 1:1
category gave way but when the difference between tp and ts increased
further.

Although human listeners are able to discriminate temporal 
patterns more precisely than specified by musical notations, they 
tend to establish perceptual categories represented by simple ratios 
between neighboring durations as in musical notations (Honing, 
2013; see also Povel, 1981). It is understandable that humans have 
to categorize temporal patterns in order to memorize, imitate, or 
respond quickly to them. This might lead to the human listen- 
ers' tendency to make the subjective ratios between neighboring 
durations closer to those in the prototypical patterns, which are 
made of simple ratios. As Fraisse (1978, 1982) indicated, the per- 
ceptual system tends to make the perceived ratio closer to a simple 
integral ratio as 1:1 or 1:2 (see also Honing, 2013). Supporting 
this observation, Nakajima (1979) reported that a pattern of two 
neighboring time intervals of 80 and 160 ms was perceived in 
ratios close to 1:1 or 1:2 avoiding intermediate cases, and Povel
(1981) systematically showed the stability of the ratio 1:2 in a task
to reproduce repeated temporal patterns. It is very likely that a
temporal pattern to be perceived as in a ratio 1:1.7, for example,
is perceptually distorted to be closer to 1:2, causing the
overestimation of the second time interval. However, this alone cannot
account for the overestimation observed in the present study.
Suppose that tp = 200 ms in the paradigm of Experiments 1, 2, and
3. Nakajima et al. (1988, Table 1) showed that the temporal
pattern 200/400 ms was perceived in a ratio 1:1.78, i.e., closer to 1:1
than the physical ratio 1:2, and this tendency was in line with their
psychophysical hypothesis. If the perceptual system tries to shift
toward a simpler ratio 1:2, then the second time interval may be
overestimated. Although this hypothesis seemed attractive, a
further examination of our own data was not very promising. For
example, in the pattern 200/520 ms in Experiment 3, which would
correspond to a subjective ratio 1:2.14 according to Nakajima
et al.'s (1988) psychophysical hypothesis, the second time interval
should be underestimated to make the subjective ratio closer to 1:2.
In reality, this pattern still caused the overestimation of ts. As in
this example, the overestimation took place more widely than was 
predicted from the perceptual system's tendency toward simpler 
ratios. No literature or experimental data are within the present 
authors' knowledge about the mechanism to show such perceptual 
tendencies, and the present experimental paradigm will be useful 
to solve this problem in the future. It should also be interesting for 
future research to examine the assimilation and contrast in a more 
complex context (e.g., Jones and McAuley, 2005). 
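
The ratio argument in the paragraph above can be traced numerically. The sketch below rests on an assumption that this section does not state: that the psychophysical hypothesis treats subjective duration as proportional to physical duration plus a constant of about 80 ms. That constant is chosen here because it reproduces the 1:2.14 figure quoted for the pattern with intervals of 200 and 520 ms; the function names are illustrative.

```python
SUPPLEMENT_MS = 80.0  # ASSUMED constant; not given in this section of the paper


def subjective_ratio(first_ms, second_ms, supplement_ms=SUPPLEMENT_MS):
    """Predicted subjective ratio (second : first) if each subjective duration
    is proportional to the physical duration plus a fixed supplement."""
    return (second_ms + supplement_ms) / (first_ms + supplement_ms)


def shift_toward_simple_ratio(ratio, target=2.0):
    """Direction in which the second interval would have to be misjudged
    for the subjective ratio to move toward the target ratio (1:target)."""
    if ratio > target:
        return "underestimation"
    if ratio < target:
        return "overestimation"
    return "no change"


r = subjective_ratio(200, 520)
print(round(r, 2))                   # 2.14
print(shift_toward_simple_ratio(r))  # underestimation
```

Under these assumptions, a pull toward 1:2 predicts underestimation of the second interval for this pattern, which is the opposite of the overestimation actually observed.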

One may wonder whether the overestimation of ts in the
present results can be explained by time-order error (TOE), which
is a phenomenon observed in psychophysics in general. Previous
studies reported that TOE is expected to be positive for short
durations of a few hundred milliseconds, such as the durations utilized
as tp in the present experiments (although it should be noted that in TOE
studies two successive and distinct intervals are used instead of two
intervals sharing a common marker; Woodrow, 1951; Eisler et al.,
2008). This means that the duration of tp should be overestimated
relative to ts. In the present experiments, ts was overestimated
(Experiments 1-3) but tp was not (Experiment 4). It seems
difficult to explain the tendencies of the present results with TOEs as
reported in classical literature (e.g., Hellström, 1985).

We began the present study in order to observe what would hap- 
pen if the temporal patterns causing time-shrinking were modified 
by lengthening the second of the two neighboring time intervals. 
This tactic worked well to find clear cases in which assimilation 
gave way to contrast. As the overestimation was so systematic, 
however, it will be necessary in the future to investigate this issue 
in a broader paradigm apart from time-shrinking. First, it is of 
some interest whether the first of the neighboring time intervals is 
also affected when the second time interval is overestimated. The 
results of Experiment 4 were negative, suggesting that the con- 
trast was unilateral, but we need further studies on this point. It 
attracts our interest as well whether any perceptual contrast would 
take place if the temporal order between the longer and the shorter 
time interval is reversed. Although there are some previous data 
for some speculation, we basically need a new set of experiments. 

Arao et al. (2000) showed that time-shrinking occurred also
in the visual modality, and it took place when the neighboring
time intervals tp and ts, in this order, had the relationship tp < ts
< ~1.8 × tp. If we see their data from the present viewpoint,
it is suggested that overestimation of ts is likely to replace
time-shrinking if ts is far above this range, and this is worth investigating
immediately. The same argument may hold also for the tactile
modality (Hasuo et al., 2014).

One big problem for our future research is that the experimental 
data are not always very stable in the present paradigm, and this can 
be the case in other related paradigms. The individual differences 
were sometimes as big as the effects to be investigated. Fortunately, 
our present purpose was simple, i.e., to examine whether system- 
atic overestimation of the second time interval would or would not 
appear; we somehow reached tentative conclusions. If the many 
issues suggested here are to be investigated in the future, however, 
we will need more sophisticated methods. One possible solution 
is to design experiments that enable us to perform some multi- 
variate analyses. Another possibility is to obtain a lot of data from 
a few participants, and to compare results in different conditions 
for each individual participant. 

We investigated the perception of empty time intervals marked 
by tone bursts, and employed temporal patterns of two neigh- 
boring time intervals. Our research question was whether the 
overestimation of the second time interval would replace the 
underestimation (time-shrinking) if the difference between the 
neighboring time intervals was increased. The overestimation 
took place very systematically when the first time interval was 
80-280 ms, and its amount sometimes exceeded 100 ms, indicat- 
ing that this was an important phenomenon related to rhythm 
perception. It is very likely that similar temporal patterns appear 
often in music. Assimilation and contrast, which Fraisse (1978) 
considered to be two important principles to construct rhythm, 
were manifested in an in vitro situation. 